OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning

Neural Information Processing Systems

Scoring the Optical Character Recognition (OCR) capabilities of Large Multimodal Models (LMMs) has witnessed growing interest. Existing benchmarks have highlighted the impressive performance of LMMs in text recognition; however, their abilities in certain challenging tasks, such as text localization, handwritten content extraction, and logical reasoning, remain underexplored. To bridge this gap, we introduce OCRBench v2, a large-scale bilingual text-centric benchmark with currently the most comprehensive set of tasks (4 more tasks than the previous multi-scene benchmark OCRBench), the widest coverage of scenarios (31 diverse scenarios), and thorough evaluation metrics, with 10,000 human-verified question-answering pairs and a high proportion of difficult samples. Moreover, we construct a private test set with 1,500 manually annotated images. The consistent evaluation trends observed across both public and private test sets validate OCRBench v2's reliability. After carefully benchmarking state-of-the-art LMMs, we find that most LMMs score below 50 (out of 100) and suffer from five types of limitations: less frequently encountered text recognition, fine-grained perception, layout perception, complex element parsing, and logical reasoning.


Global Minimizers of ℓp-Regularized Objectives Yield the Sparsest ReLU Neural Networks

Neural Information Processing Systems

Overparameterized neural networks can interpolate a given dataset in many different ways, prompting the fundamental question: which among these solutions should we prefer, and what explicit regularization strategies will provably yield these solutions? This paper addresses the challenge of finding the sparsest interpolating ReLU network--i.e., the network with the fewest nonzero parameters or neurons--a goal with wide-ranging implications for efficiency, generalization, interpretability, theory, and model compression. Unlike post hoc pruning approaches, we propose a continuous, almost-everywhere differentiable training objective whose global minima are guaranteed to correspond to the sparsest single-hidden-layer ReLU networks that fit the data. This result marks a conceptual advance: it recasts the combinatorial problem of sparse interpolation as a smooth optimization task, potentially enabling the use of gradient-based training methods. Our objective is based on minimizing ℓp quasinorms of the weights for 0 < p < 1, a classical sparsity-promoting strategy in finite-dimensional settings. However, applying these ideas to neural networks presents new challenges: the function class is infinite-dimensional, and the weights are learned using a highly nonconvex objective. We prove that, under our formulation, global minimizers correspond exactly to sparsest solutions. Our work lays a foundation for understanding when and how continuous sparsity-inducing objectives can be leveraged to recover sparse networks through training.
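
As a rough illustration of why ℓp penalties with 0 < p < 1 are described as sparsity-promoting, the sketch below compares the penalty on a sparse and a dense weight vector that have the same ℓ1 norm. This is a generic finite-dimensional toy, not the paper's training objective; the function name `lp_penalty` and the example vectors are assumptions made for illustration.

```python
import numpy as np

def lp_penalty(weights, p):
    """Return sum_i |w_i|^p, the l_p quasinorm raised to the p-th power."""
    w = np.abs(np.asarray(weights, dtype=float))
    return float(np.sum(w ** p))

# Two hypothetical weight vectors with identical l_1 norm (both equal 1).
sparse = [1.0, 0.0]   # one nonzero entry
dense = [0.5, 0.5]    # two nonzero entries

# With p = 1 the penalty cannot tell the two vectors apart.
l1_sparse = lp_penalty(sparse, 1.0)
l1_dense = lp_penalty(dense, 1.0)

# With p = 0.5 the dense vector is penalized more than the sparse one.
half_sparse = lp_penalty(sparse, 0.5)
half_dense = lp_penalty(dense, 0.5)

print(l1_sparse, l1_dense)
print(half_sparse, half_dense)
```

Under the p = 0.5 penalty the vector with fewer nonzero entries is cheaper, whereas the ℓ1 penalty is indifferent between the two. Whether gradient-based training actually reaches such minimizers for a neural network is a separate question that this toy does not address.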


Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning

Neural Information Processing Systems

Large Language Models (LLMs) are increasingly used to simulate human users in interactive settings such as therapy, education, and social role-play. While these simulations enable scalable training and evaluation of AI agents, off-the-shelf LLMs often drift from their assigned personas, contradict earlier statements, or abandon role-appropriate behavior. We introduce a unified framework for evaluating and improving persona consistency in LLM-generated dialogue. We define three automatic metrics--prompt-to-line consistency, line-to-line consistency, and Q&A consistency--that capture different types of persona drift and validate each against human annotations. Using these metrics as reward signals, we apply multi-turn reinforcement learning to fine-tune LLMs for three user roles: a patient, a student, and a social chat partner. Our method reduces inconsistency by over 55%, resulting in more coherent, faithful, and trustworthy simulated users.


Pioneering UK Nerve Lab harnesses AI to map effect of children's screen time

The Guardian

Tim Smith: 'Today's short-form, fast-paced, highly captivating content may affect children's attention, comprehension and emotional response'. Parents are constantly being told to limit their children's screen time. A relatively slow-paced programme such as Bluey offers a very different viewing experience to a fast-moving action series such as PAW Patrol, yet both are broadly considered suitable for young children. This challenge is growing as the type of content children are exposed to evolves.


Inside soccer's data renaissance

MIT Technology Review

Many of the insights hitting soccer pitches today trace back to Jesse Davis and a team of computer scientists open-sourcing tools for some of the sport's trickiest problems. Imagine tuning in to the opening kickoff of a World Cup match and seeing a player intentionally send the ball all the way down the pitch and right out of bounds on the opponent's end. Casual fans might scratch their heads. If you were Jesse Davis, though, you'd know that this play could be a prime setup to score. Davis is a professor of computer science at KU Leuven in Belgium and head of its Sports Analytics Lab, which has been at the vanguard of a data awakening in soccer since its inception more than a decade ago. Though the research group brings machine-learning models to bear on a variety of sports--including basketball, volleyball, and field hockey--nowhere is its impact felt more than on the soccer pitch.


Atom-based quantum computers are catching up in the race to usefulness

New Scientist

Some of the optical components used in Atom Computing's quantum computer. The race to build the first truly useful quantum computer just got more exciting. A quantum computer made from extremely cold atoms has now passed some of the most important milestones towards usefulness, joining a small group of equally able and promising machines. Though there is wide agreement that sufficiently powerful quantum computers would transform our ability to discover new materials and drugs, and break the encryption that underpins the internet, there are many competing ideas about how best to build them. Industry mainstays such as Google and IBM have spent a decade building quantum computers from tiny superconducting circuits, and this approach is currently the front-runner. But an alternate approach that uses electrically neutral ultracold atoms has recently been gaining traction.


Americans echo Pope Leo's concerns about AI: 'It threatens workers, privacy and human life'

The Guardian

Pope Leo XIV speaks during a meeting with bishops, members of the clergy and families whose members have been victims of environmental pollution at the Cathedral of Santa Maria Assunta, in Acerra, Italy, on 23 May 2026. Guardian readers in the US spoke of fears about unregulated AI in response to the pope's encyclical warning about the risks of the technology. In his first major papal text since assuming leadership of the Catholic church last year, Pope Leo issued a stark warning about the rise of artificial intelligence this week, denouncing the "culture of power" driving the AI age. Calling for the "most rigorous" ethical constraints on AI - which he described as one of the greatest threats facing humanity today - the first US-born pope also warned of "new forms of slavery" emerging through the digital economy. Speaking to the Guardian, readers in the US echoed the pope's concerns, describing AI as an "unregulated" industry increasingly being used to the "detriment of too many people", while also raising fears about surveillance, labor displacement, war and environmental harm.


Brewers pitcher Abner Uribe gets a one-game suspension for crotch-chopping celebration

FOX News

The Milwaukee Brewers will be without pitcher Abner Uribe for one game after he was suspended for breaking one of baseball's biggest unwritten rules: Never direct a Kenny Powers/D-Generation X crotch chop at the opposing dugout. Abner Uribe's controversial celebration divided baseball fans, sparked retaliation fears, and drew criticism from his own manager. Uribe hit the iconic celebration this weekend after an inning-ending strikeout against the St. Louis Cardinals. Brewers' manager Pat Murphy did not condone the celebration, and Uribe himself apologized for it.


Parameter-Efficient Generative Modeling with Controlled Vector Fields

arXiv.org Machine Learning

We introduce a continuous-time generative modeling framework, motivated by the Chow-Rashevskii theorem, that builds expressive flows from a small set of fixed vector fields and learned scalar controls. Instead of learning an unconstrained high-dimensional vector field, our framework constructs the velocity by modulating fixed vector fields with learned scalar control functions. When the fixed fields are bracket-generating, their Lie algebra spans the ambient space, providing a mechanism for expressive transport with only a small number of learned control channels and offering a parameter-efficient geometric alternative to standard vector-field parameterizations. This decoupled formulation yields a structured and interpretable generative model in which the number of learned scalar output channels can be chosen independently of the ambient dimension. We formulate an expressivity principle showing that, under suitable controllability and well-posedness assumptions, such controlled flows can transport a source distribution to a target distribution. We train the resulting model using a continuous-normalizing-flow likelihood objective and present proof-of-concept experiments on synthetic distributions.
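
The construction described above, a velocity built by modulating fixed vector fields with scalar controls, can be sketched in a few lines. Everything concrete here is an assumption made for illustration: the two Heisenberg-type fields on R^3 (a standard bracket-generating pair), the hand-written `controls` function standing in for learned scalar controls, and the forward-Euler integrator. It is not the paper's model or training procedure.

```python
import numpy as np

def fixed_fields(x):
    """Two fixed vector fields on R^3, returned as rows of a (2, 3) array.

    Illustrative Heisenberg-type pair: their Lie bracket is (0, 0, 1),
    so together with the bracket they span R^3 at every point.
    """
    x1, x2, _ = x
    return np.array([[1.0, 0.0, -0.5 * x2],
                     [0.0, 1.0, 0.5 * x1]])

def controls(x, t):
    """Hypothetical stand-in for the learned scalar controls u_i(x, t)."""
    return np.array([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)])

def velocity(x, t):
    """Velocity v(x, t) = sum_i u_i(x, t) * V_i(x)."""
    return controls(x, t) @ fixed_fields(x)

def integrate(x0, n_steps=1000):
    """Forward-Euler integration of dx/dt = v(x, t) over t in [0, 1]."""
    x = np.asarray(x0, dtype=float)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        x = x + dt * velocity(x, k * dt)
    return x

x_final = integrate([0.0, 0.0, 0.0])
print(x_final)
```

With these particular controls the first two coordinates trace a closed loop and return near zero, while the third coordinate ends up displaced even though neither fixed field points along it at the origin. That is the intuition behind using only two control channels in a three-dimensional space: motion in the missing direction is generated through the bracket of the fixed fields.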